in_kubernetes_events: Efficiently stream kubernetes events via watch#8351

Merged
edsiper merged 3 commits into fluent:master from ryanohnemus:feature/in_k8s_watch
Jun 24, 2024
Conversation

@ryanohnemus
Contributor

@ryanohnemus ryanohnemus commented Jan 4, 2024

Change the in_kubernetes_events plugin to watch Kubernetes events after requesting the event list. Instead of polling for the full event list every 500ms (default), an initial full event list is requested and then a watch is requested. The watch creates an efficient HTTP chunked stream that pushes events as they are added, modified, or deleted in the cluster. The interval_sec and interval_nsec plugin config options now act as a reconnect timer if the watch stream ends, instead of a timer to re-poll the k8s cluster.
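For readers unfamiliar with the list-then-watch pattern, here is a rough Python sketch of the control flow described above. It is not the plugin's C implementation: `list_then_watch`, `fake_list`, `fake_watch`, and the `"LISTED"` label are hypothetical names made up for illustration, and the API server is replaced by in-memory fakes. The only Kubernetes details assumed are that a list response carries `metadata.resourceVersion` and that a watch stream delivers one JSON object per line with `type` and `object` fields.

```python
import json
import time


def list_then_watch(list_events, open_watch, emit,
                    reconnect_delay=0.5, max_reconnects=3, sleep=time.sleep):
    """List once, then watch; reconnect after a delay when the stream ends.

    list_events, open_watch and emit are caller-supplied callables
    (hypothetical interface, not the fluent-bit API).
    """
    listing = list_events()
    for item in listing["items"]:
        emit("LISTED", item)
    # The watch resumes from the version returned by the initial list.
    resource_version = listing["metadata"]["resourceVersion"]

    for _ in range(max_reconnects):
        for line in open_watch(resource_version):
            if not line.strip():
                continue  # skip blank keep-alive lines
            change = json.loads(line)
            obj = change["object"]
            # Remember the newest version so a reconnect does not replay events.
            resource_version = obj["metadata"]["resourceVersion"]
            emit(change["type"], obj)
        # Stream ended: wait before reconnecting (the role interval_sec and
        # interval_nsec now play, per the description above).
        sleep(reconnect_delay)
    return resource_version


# In-memory stand-ins for the API server (illustrative data only).
def fake_list():
    return {"metadata": {"resourceVersion": "10"},
            "items": [{"metadata": {"name": "a", "resourceVersion": "9"}}]}


streams = [
    ['{"type": "ADDED", "object": {"metadata": {"name": "b", "resourceVersion": "11"}}}'],
    ['{"type": "DELETED", "object": {"metadata": {"name": "a", "resourceVersion": "12"}}}'],
]
seen = []      # resourceVersion passed to each watch request
received = []  # (type, name) pairs delivered to the consumer


def fake_watch(resource_version):
    seen.append(resource_version)
    return iter(streams.pop(0)) if streams else iter([])


last_rv = list_then_watch(
    fake_list, fake_watch,
    lambda kind, obj: received.append((kind, obj["metadata"]["name"])),
    max_reconnects=2, sleep=lambda seconds: None)
```

The point of tracking `resource_version` is that each reconnect asks only for changes after the last one seen, rather than re-fetching the full list as the old polling loop did.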

Potentially Breaking: this will require the kubernetes role used by fluent-bit to have watch permission in addition to the current list and get permissions.
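As an illustration of the permission change, a role along the following lines should cover it. This is a sketch, not taken from this PR or the fluent-bit docs: the role name is hypothetical, and it assumes events are read from the core API group.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluent-bit-events   # hypothetical name
rules:
  - apiGroups: [""]         # assumption: core API group
    resources: ["events"]
    verbs: ["get", "list", "watch"]   # "watch" is the newly required verb
```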

Fixes #8315

Leaving in draft as this is dependent on both #8316 & #8323, will rebase and move out of draft after those are reviewed/merged.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • [X] Example configuration file for the change
[INPUT]
    name          kubernetes_events
    tag           k8s_events
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

@ryanohnemus
Contributor Author

@edsiper - just force pushed a new version of this. I still have this PR in draft mode because it was branched off of 2 other PRs: #8316 & #8323. I was assuming it would be easier to review those individually, and then I'd rebase this one with a smaller set of changes to review. If you'd prefer to review and merge them all in one PR, I can move this PR to ready for review.

@ryanohnemus ryanohnemus requested a review from edsiper January 10, 2024 19:53
@edsiper edsiper added this to the Fluent Bit v3.0.0 milestone Jan 10, 2024
@ryanohnemus ryanohnemus force-pushed the feature/in_k8s_watch branch 2 times, most recently from 8639849 to 589a7cb Compare January 16, 2024 20:16
@ryanohnemus
Contributor Author

@edsiper @patrick-stephens this didn't get merged in with 3.0 but it is ready to go. Can this be added to the next milestone/release?

@patrick-stephens
Contributor

Did you address the review comments from @edsiper?

@ryanohnemus
Contributor Author

@patrick-stephens yes, unless I missed something they should all be marked resolved. I rebased and force pushed after some dependencies had a merge conflict, which I think messes with the comment visibility.

@edsiper
Member

edsiper commented Apr 9, 2024

@pwhelan do you think you can take a look at this one ?

Contributor

@pwhelan pwhelan left a comment

I have a lot of comments regarding the manner in which the events are continually processed.

@ryanohnemus
Contributor Author

Rebased and force pushed the following updates:

  • moved the upstream and HTTP client for the stream into the context so they can be cleaned up on shutdown without leaks
  • added 2 tests that use monkey/server to mock the k8s upstream connection
  • the continuous wait on the stream still exists; assuming this is OK since the input is threaded
  • a chunked-streaming test can be added later; these tests were to show there is no memory leak:
valgrind --leak-check=full ./bin/flb-rt-in_kubernetes_events

[removed test output]

SUCCESS: All unit tests have passed.
==931== 
==931== HEAP SUMMARY:
==931==     in use at exit: 0 bytes in 0 blocks
==931==   total heap usage: 3,845 allocs, 3,845 frees, 2,002,206 bytes allocated
==931== 
==931== All heap blocks were freed -- no leaks are possible
==931== 
==931== For lists of detected and suppressed errors, rerun with: -s
==931== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

@pwhelan please take another look

(posting this before CI has fully run, so i will review any test failures if they show up)

@ryanohnemus
Contributor Author

appveyor and macos build errors both appear to be flakes.

@ryanohnemus
Contributor Author

@pwhelan @edsiper @lecaros - this fell out of the next milestone with the last few 3.0 releases but is still ready to go. Can this be added to the existing 3.0.7 milestone? Thank you

Potentially breaking change as it now requires the
rbac used by fluent-bit to have 'watch'.

Uses a k8s watch instead of http api polling to
stream k8s events from the kube api server

Signed-off-by: ryanohnemus <ryanohnemus@gmail.com>
Signed-off-by: ryanohnemus <ryanohnemus@gmail.com>
Signed-off-by: ryanohnemus <ryanohnemus@gmail.com>
@ryanohnemus
Contributor Author

Rebased to fix merge conflict.

@edsiper @pwhelan @lecaros could this be tagged in the 3.1.0 milestone so it does not get missed? Thank you!

@patrick-stephens
Contributor

We probably should update docs as well, particularly with the RBAC change. Could you link a docs PR @ryanohnemus ?

Do we have any int tests for this btw?

@ryanohnemus
Contributor Author

@patrick-stephens added doc via fluent/fluent-bit-docs#1396

No int tests, but I added unit tests for the plugin in this PR.

Labels

docs-required ok-package-test Run PR packaging tests

Development

Successfully merging this pull request may close these issues.

in_kubernetes_events: Inefficient Defaults Lead To Kube API Spamming/Resource Drain & extra processing required in fluent-bit

5 participants